ABSTRACT

In this paper subset selection procedures for selecting all treatment
populations with means larger than a control population are proposed.
The treatments and control are assumed to have a multivariate normal
distribution. Various covariance structures are considered. All of the
proposed procedures are easily implemented using existing tables of the
multivariate normal and multivariate t distributions. Some other
procedures which have been proposed require extensive and unavailable
tables for their implementation.


SELECTING  ALL  TREATMENTS  BETTER  THAN
A  CONTROL  USING  EXISTING  TABLES


1.  INTRODUCTION 

Let $\Pi_1, \ldots, \Pi_k$ denote $k$ ($k \ge 1$) treatment populations with means
$\mu_1, \ldots, \mu_k$ and let $\Pi_0$ denote a control population with mean $\mu_0$. It will
be assumed that $\Pi_0, \ldots, \Pi_k$ have a multivariate normal distribution.
Treatment population $\Pi_i$ is said to be better than the control if $\mu_i \ge \mu_0$. The
goal is to select a subset of the treatment populations which contains all
populations which are better than the control. A correct selection (CS) is
the selection of any subset which contains all the treatments which are
better than the control. In this paper, selection procedures are proposed
which ensure that the probability of a correct selection, $P_{\mu}(CS)$, is at
least $P^*$, regardless of the true value of $\mu = (\mu_0, \ldots, \mu_k)$, where $P^*$ is
a preassigned constant satisfying $0 < P^* < 1$. The requirement that
$P_{\mu}(CS) \ge P^*$ for all $\mu$ is called the P*-condition. The procedures proposed
in this paper are easily implemented since any critical values needed can
be obtained from existing tables of the multivariate normal distribution
(e.g., Gupta, Nagel, and Panchapakesan (1973)) and multivariate t
distribution (e.g., Krishnaiah and Armitage (1966)).

Paulson (1952) and Dunnett (1955) were among the first authors to
consider treatment versus control comparison problems. Gupta and Sobel (1958)
introduced the subset selection formulation which is being considered
herein. Recently, Chen (1980) and Chen and Pickett (1980) have considered
the subset selection formulation for the case of dependent populations.
These authors have pointed out the importance of dependence in repeated
measures designs.
This work is closely related to Chen (1980). It differs from Chen's work
in that some covariance structures are considered which Chen did not
consider. In particular, this paper considers situations in which the
control variance differs from the treatment variance. The selection
procedures in this paper are the same as the procedures proposed by Chen
in those situations when the same model is being considered, but the
procedures are written in a slightly different form. This modified form
has the advantage that existing tables for the multivariate normal and
t distributions can now be used to implement the procedures. Thus, this
work, in addition to proposing new selection procedures, should make some
of Chen's procedures much easier to use. Existing tables can be used to
implement the procedures for a wider range of models than the range of
models for which tables were provided by Chen.

The following notation will be used. $Y \sim MN(m, \mu, \Sigma)$ means the random
vector $Y$ has an $m$-dimensional multivariate normal distribution with mean
vector $\mu$ and covariance matrix $\Sigma$. $\Phi(z)$ and $\phi(z)$ denote the distribution and
density function of the standard univariate normal distribution.
$\Phi_k(z_1, \ldots, z_k; \rho)$ denotes the distribution function of the $k$-variate
standard normal distribution with zero means, unit variances and all
correlations equal to $\rho$. $F_{k,\nu}(t_1, \ldots, t_k; \rho)$ denotes the distribution
function of the $k$-variate central t distribution with $\nu$ degrees of freedom
and all correlations equal to $\rho$.




2.  KNOWN  COVARIANCE  CASE 


In this section assume $X \sim MN(k + 1, \mu, V)$ where $X = (X_0, \ldots, X_k)$;
$\mu = (\mu_0, \ldots, \mu_k)$ is unknown but $V = (v_{ij};\ i, j = 0, \ldots, k)$ is known. Further
assume $V$ has the form

$$v_{00} = v_0^2, \quad v_{11} = \cdots = v_{kk} = v^2, \quad v_{01} = \cdots = v_{0k} = a,$$

and $v_{ij} = b$ for $i \ne j$, $i, j = 1, \ldots, k$. Typically $X_i$, the observation from
$\Pi_i$, will be a sample mean, as the examples at the end of this section
illustrate, but for now only the single vector observation $X$ is considered.
The $k$ treatment populations are all assumed to have equal variances and
covariances, but the variance of the control, $v_0^2$, may be different, and the
covariance between the control and a treatment, $a$, need not equal the
covariance between two treatments, $b$. Chen (1980) only considered the case
in which $v_0^2 = v^2$. But in some situations much more data is available on the
control than on the treatments. In these situations, it will usually be the
case that $v_0^2 < v^2$.


2.1  Selection  Procedure 

Procedure $R_1$: Include population $\Pi_i$ in the selected subset if and only if

$$x_i \ge x_0 - c_1 \sqrt{v_0^2 + v^2 - 2a} \qquad (2.1)$$

where $c_1$ is chosen to satisfy (2.2).

Theorem 1: For a given $P^*$, if $c_1$ is chosen to satisfy

$$\Phi_k(c_1, \ldots, c_1; \rho) = P^* \qquad (2.2)$$

where $\rho = (v_0^2 + b - 2a)/(v_0^2 + v^2 - 2a)$, then $R_1$ satisfies the P*-condition.

Proof:  This  proof  is  similar  to  the  proof  of  Theorem  1  in  Chen  (1980). 
It  is  included  here  for  completeness. 

Fix $\mu = (\mu_0, \ldots, \mu_k)$. Let $i_1, \ldots, i_B$ denote the subscripts of the $B$
populations which are better than the control. Let

$$Z_i = \frac{X_0 - X_i - (\mu_0 - \mu_i)}{\sqrt{v_0^2 + v^2 - 2a}}, \quad i = 1, \ldots, k.$$

Then

$$\begin{aligned}
P_{\mu}(CS \mid R_1) &= P_{\mu}(\text{select } \Pi_{i_j},\ j = 1, \ldots, B) \\
&= P_{\mu}\big(X_{i_j} \ge X_0 - c_1 \sqrt{v_0^2 + v^2 - 2a},\ j = 1, \ldots, B\big) \\
&= P_{\mu}\big(Z_{i_j} \le c_1 + (\mu_{i_j} - \mu_0)/\sqrt{v_0^2 + v^2 - 2a},\ j = 1, \ldots, B\big) \\
&\ge P_{\mu}(Z_{i_j} \le c_1,\ j = 1, \ldots, B) \\
&\ge P_{\mu}(Z_i \le c_1,\ i = 1, \ldots, k).
\end{aligned}$$

The first inequality is true since $\mu_{i_j} \ge \mu_0$, $j = 1, \ldots, B$.
$Z = (Z_1, \ldots, Z_k) \sim MN(k, 0, R)$ where $R = (r_{ij})$, $r_{ii} = 1$, $i = 1, \ldots, k$, and
$r_{ij} = \rho$, $i \ne j$, $i, j = 1, \ldots, k$. By (2.2), $P_{\mu}(Z_i \le c_1,\ i = 1, \ldots, k) = P^*$.
Since $\mu$ was arbitrary, $R_1$ satisfies the P*-condition. ||
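Rule (2.1) and the correlation in Theorem 1 can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and the numerical inputs are hypothetical, and the constant `c1` is assumed to have been read from a table or computed separately so that (2.2) holds.

```python
import math

def rho_known_cov(v0_sq, v_sq, a, b):
    """Common correlation of the differences X_0 - X_i (Theorem 1)."""
    return (v0_sq + b - 2.0 * a) / (v0_sq + v_sq - 2.0 * a)

def select_r1(x, v0_sq, v_sq, a, c1):
    """Return the treatment indices i (1..k) retained by rule (2.1).

    x[0] is the control observation and x[1:] are the treatment observations.
    """
    se = math.sqrt(v0_sq + v_sq - 2.0 * a)  # standard deviation of X_0 - X_i
    cutoff = x[0] - c1 * se
    return [i for i in range(1, len(x)) if x[i] >= cutoff]

# Hypothetical numbers, for illustration only (c1 is not a tabled value).
x = [10.0, 9.2, 10.5, 7.0]
print(rho_known_cov(v0_sq=0.5, v_sq=1.0, a=0.2, b=0.3))
print(select_r1(x, v0_sq=0.5, v_sq=1.0, a=0.2, c1=1.9))
```

Only the differences from the control and their common standard deviation enter the rule, which is why a single constant serves all k comparisons.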

2.2  Tables for $c_1$

The constant $c_1$, which depends on $k$, $P^*$, and $\rho$, is the value which is
tabulated in Table 1 of Gupta, Nagel and Panchapakesan (1973). The
correspondence of notation is $N = k$, $\alpha = 1 - P^*$ and $\rho = \rho$, where the notation on
the left of each equality is the Gupta, Nagel and Panchapakesan notation
and the notation on the right of each equality is the notation of this paper.
This table covers $P^* = .75, .90, .975$ and $.99$, all $k$ values between 1 and 10
and all even $k$ values between 12 and 50, and 17 different $\rho$ values between
.1 and .9. This table and interpolation therein seem to be adequate for the
$k$ and $\rho$ values used in most applications. If other $P^*$ values are used,
Table II of Gupta (1963) can be used. Here the correspondence of notation
is $H = c_1$, $N = k$, $\rho = \rho$ and the tabled value is $P^*$, where again the left side
of each equality is Gupta's notation and the right side is the notation
of this paper.

If the value of $c_1$ for other values of $k$, $P^*$ and $\rho$ is needed, then
$c_1$ can be found by numerical methods as the solution of the equality

$$\int_{-\infty}^{\infty} \Phi^k\big((x\sqrt{\rho} + c_1)/\sqrt{1 - \rho}\big)\, d\Phi(x) = P^* \qquad (2.3)$$

(see Gupta, Nagel and Panchapakesan (1973)). Solving (2.3) should be more
efficient than solving equation (3.3) of Chen (1980) since (2.3) involves
only a single integral whereas Chen's equation involves a double integral.
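As a rough illustration of solving (2.3) numerically, the following Python sketch evaluates the single integral with composite Simpson's rule and finds $c_1$ by bisection. It assumes $0 \le \rho < 1$; the function names, the truncation of the integral to $[-8, 8]$, the step count and the tolerance are all choices made for this sketch, not part of the paper.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal: provides cdf, pdf, inv_cdf

def equicorrelated_normal_cdf(c, k, rho, steps=2000, lim=8.0):
    """Phi_k(c, ..., c; rho) via the single integral in (2.3).

    Assumes 0 <= rho < 1 and an even number of Simpson steps.
    """
    sr, s1 = math.sqrt(rho), math.sqrt(1.0 - rho)
    h = 2.0 * lim / steps
    total = 0.0
    for j in range(steps + 1):
        x = -lim + j * h
        w = 1 if j in (0, steps) else (4 if j % 2 else 2)  # Simpson weights
        total += w * _STD.cdf((x * sr + c) / s1) ** k * _STD.pdf(x)
    return total * h / 3.0

def solve_c1(k, p_star, rho, tol=1e-7):
    """Bisection for c_1 such that Phi_k(c_1, ..., c_1; rho) = P*."""
    lo, hi = -10.0, 10.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if equicorrelated_normal_cdf(mid, k, rho) < p_star:
            lo = mid  # probability too small, so c_1 lies above mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(round(solve_c1(k=2, p_star=0.95, rho=0.5), 3))
```

Bisection is used because the left side of (2.3) is increasing in $c_1$; for $k = 1$ the solution reduces to the ordinary normal quantile, which gives a simple check of the quadrature.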

2.3  Examples 

In the following examples, some special cases of the general model are
considered. These examples illustrate some of the situations to which the
general model applies. It should be remembered that in all these examples
the procedures can be implemented easily since the constant $c_1$ can be obtained
from Table 1 of Gupta, Nagel and Panchapakesan (1973).

Example 1: Let $Y_1, \ldots, Y_n$ be independent, $Y_j \sim MN(k + 1, \mu, \Sigma)$ where
$\Sigma = (\sigma_{ij};\ i, j = 0, \ldots, k)$. Further assume $\Sigma$ has the form $\sigma_{00} = \sigma_0^2$,
$\sigma_{11} = \cdots = \sigma_{kk} = \sigma^2$, $\sigma_{01} = \cdots = \sigma_{0k} = \alpha$ and $\sigma_{ij} = \beta$, $i \ne j$, $i, j = 1, \ldots, k$.
Let $X$ be the sample mean of $Y_1, \ldots, Y_n$. Then $X \sim MN(k + 1, \mu, V)$ where
$v_0^2 = \sigma_0^2/n$, $v^2 = \sigma^2/n$, $a = \alpha/n$ and $b = \beta/n$. The procedure $R_1$ becomes: select
$\Pi_i$ if and only if

$$x_i \ge x_0 - c_1 \sqrt{(\sigma_0^2 + \sigma^2 - 2\alpha)/n} \qquad (2.4)$$

where $c_1$ is determined by (2.2) with $\rho = (\sigma_0^2 + \beta - 2\alpha)/(\sigma_0^2 + \sigma^2 - 2\alpha)$.

The case in which $\sigma_0 = \alpha = 0$ is of special interest. In this case, $X_0$
equals $\mu_0$ with probability one. That is to say, this is the case in which
the control mean $\mu_0$ is known. In this case, $R_1$ is: select $\Pi_i$ if and only if

$$x_i \ge \mu_0 - c_1 \sigma/\sqrt{n} \qquad (2.5)$$

where $c_1$ is determined by (2.2) with $\rho = \beta/\sigma^2$, the correlation between
any two treatment populations. This procedure is the procedure $P_1$
proposed by Chen (1980) for the $\mu_0$ known case.

Example 2: Assume the same model as in Example 1. Assume further that
$\sigma_0 = \sigma$ and $\alpha = \beta$. Table I in Chen (1980) was provided for this equal
variance and equal covariance case. Let $\gamma = \alpha/\sigma^2$ be the common known
correlation. The procedure $R_1$ becomes: select $\Pi_i$ if and only if

$$x_i \ge x_0 - c_1 \sigma \sqrt{2(1 - \gamma)/n} \qquad (2.6)$$

where $c_1$ is determined by (2.2) with $\rho = 1/2$. Comparing $R_1$ with the
procedure $P_2$ proposed by Chen (1980) for this case, they are found to be the
same when the identification $d_2 = c_1 \sqrt{2(1 - \gamma)}$ is made; $d_2$ is the constant
tabled by Chen. The advantage of writing the procedure in the form (2.6)
is that, whereas Chen required a separate table entry for each value of
$\gamma$ ($\rho$ in Chen's notation), only the Gupta, Nagel and Panchapakesan (1973)
table for $\rho = 1/2$ is needed to determine $c_1$, regardless of the value of $\gamma$.
The form (2.6) and the Gupta, Nagel and Panchapakesan table might also be
preferred since this table provides four decimal places for $c_1$ whereas
Chen's Table I provides only two decimal places for $d_2$.

Example 3: Procedure $R_1$ can be used in the situation in which there are
separate samples of different sizes on the control and treatment populations.
It can be used if in addition there is a joint sample on the control and
treatment populations. Let $Y_1, \ldots, Y_n$ be defined as in Example 1. Let
$m_1$, $m_2$ and $m_3$ be non-negative integers with $m_1 + m_2 + m_3 = n$. Let $r = m_1 + m_2$
and $s = m_1 + m_3$. Let

$$X_0 = \sum_{j=1}^{r} Y_{0j}/r \quad \text{and} \quad X_i = \Big(\sum_{j=1}^{m_1} Y_{ij} + \sum_{j=r+1}^{n} Y_{ij}\Big)/s, \quad i = 1, \ldots, k.$$

The sample size for the joint sample of the treatments and the control is
$m_1$. The sample sizes of the additional samples on the control and the
treatments are $m_2$ and $m_3$ respectively. Then $X \sim MN(k + 1, \mu, V)$ where
$v_0^2 = \sigma_0^2/r$, $v^2 = \sigma^2/s$, $a = m_1\alpha/(rs)$ and $b = \beta/s$. For this model, $R_1$ becomes:
select $\Pi_i$ if and only if

$$x_i \ge x_0 - c_1 \sqrt{(s\sigma_0^2 + r\sigma^2 - 2m_1\alpha)/(rs)} \qquad (2.7)$$

where $c_1$ is determined by (2.2) with
$\rho = (s\sigma_0^2 + r\beta - 2m_1\alpha)/(s\sigma_0^2 + r\sigma^2 - 2m_1\alpha)$.
A case of particular interest is the case $m_1 = 0$. This is the case in
which there is a sample of size $m_2$ from the control population and an
independent sample of size $m_3$ from the treatment populations. If in addition
the treatment populations are independent, then $R_1$ reduces to the procedure
proposed by Gupta and Sobel (1958) (equation (3.10)) if the identification
is made that $d = c_1 \sqrt{m_3\sigma_0^2 + m_2\sigma^2}$, where $d$ is a constant defined by
Gupta and Sobel. The Gupta and Sobel procedure may be used when there are
unequal sample sizes on the various treatments, a situation not covered by
the model presented here.
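The bookkeeping in Example 3 is easy to get wrong by hand, so a small Python sketch may help. It maps the per-observation parameters and the three sample sizes to the entries of $V$ and then to the standard error in (2.7) and the correlation used in (2.2). The function name and the numerical inputs are hypothetical.

```python
import math

def example3_constants(sigma0_sq, sigma_sq, alpha, beta, m1, m2, m3):
    """Return (standard error in (2.7), rho for (2.2)) for Example 3."""
    r = m1 + m2  # number of observations on the control
    s = m1 + m3  # number of observations on each treatment
    v0_sq = sigma0_sq / r        # variance of X_0
    v_sq = sigma_sq / s          # variance of each X_i
    a = m1 * alpha / (r * s)     # covariance of X_0 and X_i (joint sample only)
    b = beta / s                 # covariance of X_i and X_j
    se = math.sqrt(v0_sq + v_sq - 2.0 * a)
    rho = (v0_sq + b - 2.0 * a) / (v0_sq + v_sq - 2.0 * a)
    return se, rho

# Hypothetical inputs: 5 joint observations, 15 extra on the control,
# 5 extra on the treatments.
se, rho = example3_constants(4.0, 9.0, 1.0, 2.0, m1=5, m2=15, m3=5)
print(se, rho)
```

Setting `m1=0` in this sketch gives the independent-samples case discussed at the end of the example, where the covariance term `a` vanishes.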


3.  UNKNOWN  VARIANCE,  KNOWN  CORRELATION  CASE


In this section the case in which the treatments and control have
a common unknown variance and known correlations is considered.

Let $Y_1, \ldots, Y_n$ be independent, $Y_j \sim MN(k + 1, \mu, \sigma^2 R)$ where $\mu = (\mu_0, \ldots, \mu_k)$
and $\sigma^2$ are unknown but $R = (r_{ij};\ i, j = 0, \ldots, k)$ is known and has the form
$r_{00} = \cdots = r_{kk} = 1$, $r_{01} = \cdots = r_{0k} = r_0$ and $r_{ij} = r$ for $i \ne j$, $i, j = 1, \ldots, k$.
Let $X = (X_0, \ldots, X_k)$ be the sample mean of $Y_1, \ldots, Y_n$. Let $S = (s_{ij};\ i, j = 0, \ldots, k)$
be the usual unbiased sample covariance matrix, i.e.,

$$s_{ij} = \sum_{m=1}^{n} (Y_{im} - X_i)(Y_{jm} - X_j)/(n - 1), \quad i, j = 0, \ldots, k.$$

An estimate of $\sigma^2$ which will be used is $S_0^2 = \mathrm{tr}(R^{-1}S)/(k + 1)$. It is known
(see Anderson (1958)) that $S_0^2$ is independent of $X$ and $(k + 1)(n - 1)S_0^2/\sigma^2$ has a
chi-squared distribution with $\nu = (k + 1)(n - 1)$ degrees of freedom. For
computational purposes, it should be noted that

$$\mathrm{tr}(R^{-1}S) = \frac{d\,s_{00} + 2e\big(\sum_{i=1}^{k} s_{i0}\big) + f\big(\sum_{i=1}^{k} s_{ii}\big) + 2g\big(\sum_{1 \le i < j \le k} s_{ij}\big)}{1 + (k - 1)r - k r_0^2}$$

where $d = 1 + (k - 1)r$, $e = -r_0$, $g = (r_0^2 - r)/(1 - r)$ and

$$f = 1 + (k - 1)r - (k - 1)r_0^2 - (k - 1)r(r_0^2 - r)/(1 - r).$$
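The closed form for $\mathrm{tr}(R^{-1}S)$ can be checked numerically without inverting $R$: build the matrix whose entries are $d$, $e$, $f$, $g$ divided by $1 + (k - 1)r - k r_0^2$ and confirm that its product with $R$ is the identity. The Python sketch below does this with plain lists. The function names are hypothetical, and $f$ is taken in the form that makes that product the identity, which is an assumption wherever the printed formula is hard to read.

```python
def correlation_matrix(k, r0, r):
    """(k+1)x(k+1) matrix R: unit diagonal, r0 in row/column 0, r elsewhere."""
    R = [[r] * (k + 1) for _ in range(k + 1)]
    for i in range(k + 1):
        R[0][i] = R[i][0] = r0
        R[i][i] = 1.0
    return R

def closed_form_inverse(k, r0, r):
    """Candidate R^{-1} built from the coefficients d, e, f, g (requires r != 1)."""
    denom = 1.0 + (k - 1) * r - k * r0 ** 2
    d = 1.0 + (k - 1) * r
    e = -r0
    g = (r0 ** 2 - r) / (1.0 - r)
    f = 1.0 + (k - 1) * r - (k - 1) * r0 ** 2 - (k - 1) * r * g
    m = [[g / denom] * (k + 1) for _ in range(k + 1)]
    for i in range(k + 1):
        m[0][i] = m[i][0] = e / denom
        m[i][i] = f / denom
    m[0][0] = d / denom  # control entry overrides the diagonal value set above
    return m

def s0_squared(S, r0, r):
    """S_0^2 = tr(R^{-1} S) / (k + 1) for a (k+1)x(k+1) covariance matrix S."""
    k = len(S) - 1
    m = closed_form_inverse(k, r0, r)
    trace = sum(m[i][j] * S[j][i] for i in range(k + 1) for j in range(k + 1))
    return trace / (k + 1)

# If S were exactly 2.5 * R, the estimate should return 2.5.
R = correlation_matrix(3, 0.3, 0.5)
print(s0_squared([[2.5 * v for v in row] for row in R], 0.3, 0.5))
```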

3.1  Selection  Procedure 

Procedure $R_2$: Include population $\Pi_i$ in the selected subset if and only if

$$x_i \ge x_0 - c_2 s_0 \sqrt{(2 - 2r_0)/n} \qquad (3.1)$$

where $c_2$ is chosen to satisfy (3.2).

Theorem 2: For a given $P^*$, if $c_2$ is chosen to satisfy

$$F_{k,\nu}(c_2, \ldots, c_2; \rho) = P^* \qquad (3.2)$$

where $\rho = (1 + r - 2r_0)/(2 - 2r_0)$ and $\nu = (k + 1)(n - 1)$, then $R_2$ satisfies
the P*-condition.

Proof: Fix $\mu$ and $\sigma^2$. Let

$$Z_i = \frac{X_0 - X_i - (\mu_0 - \mu_i)}{\sqrt{(2 - 2r_0)/n}}, \quad i = 1, \ldots, k.$$

Let $T_i = Z_i/S_0$. Then $Z = (Z_1, \ldots, Z_k) \sim MN(k, 0, V)$ where $V = (v_{ij};\ i, j = 1, \ldots, k)$,
$v_{ii} = \sigma^2$, $i = 1, \ldots, k$, and $v_{ij} = \sigma^2\rho$, $i \ne j$, $i, j = 1, \ldots, k$, and $(k + 1)(n - 1)S_0^2/\sigma^2$
has a chi-squared distribution with $\nu$ degrees of freedom and is independent
of $Z$. Thus $T = (T_1, \ldots, T_k)$ has a standard central multivariate t distribution
with $\nu$ degrees of freedom and all the off-diagonal elements of the correlation
matrix equal to $\rho$. Arguing as in the proof of Theorem 1,

$$P_{\mu,\sigma^2}(CS \mid R_2) \ge P_{\mu,\sigma^2}(T_i \le c_2,\ i = 1, \ldots, k) = F_{k,\nu}(c_2, \ldots, c_2; \rho) = P^*.$$

Since $\mu$ and $\sigma^2$ were arbitrary, $R_2$ satisfies the P*-condition. ||

3.2  Tables for $c_2$

The constant $c_2$, which depends on $k$, $P^*$, $\rho$ and $\nu$, is the value which is
tabulated in Krishnaiah and Armitage (1966). The correspondence of notation
is $p = k$, $\alpha = 1 - P^*$, $\rho = \rho$ and $n = \nu$, where the notation on the left of
each equality is the Krishnaiah and Armitage notation and the notation on
the right of each equality is the notation of this paper. This table covers
$P^* = .95$ and $.99$, $k = 1(1)10$, $\rho = 0.0(.1).9$ and $\nu = 5(1)35$. For larger values
of $\nu$, Table 1 of Gupta, Nagel and Panchapakesan (1973) may be used to
approximate $c_2$ since this normal table corresponds to $\nu = \infty$ (cf. Section 2.2).
Gupta (1963a) provides references to some other partial tables of the
multivariate t distribution.

If the value of $c_2$ for other values of $k$, $P^*$, $\rho$ and $\nu$ is needed, it
can be found by numerical methods as the solution of the equality

$$\int_0^{\infty} h_{\nu}(y) \int_{-\infty}^{\infty} \Phi^k\big((x\sqrt{\rho} + c_2 y/\sqrt{\nu})/\sqrt{1 - \rho}\big)\, \phi(x)\, dx\, dy = P^* \qquad (3.3)$$

where $h_{\nu}(y)$ is the chi density corresponding to $\nu$ degrees of freedom for the
chi-squared distribution (see equation (6.7) of Gupta (1963b)). Solving
(3.3) should be more efficient than solving equation (5.3) of Chen (1980) since
(3.3) involves only a double integral whereas Chen's equation involves a
triple integral.

3.3  Example 

Example 4: Procedure $R_2$ is the same as the procedure $P_4$ proposed by Chen
(1980) if the identification is made that $d_4 = c_2 \sqrt{2 - 2r_0}$, where $d_4$ is a
constant defined by Chen. Chen's procedure was proposed for a more general
correlation structure. But the advantage of writing the procedure as $R_2$
is that $c_2$ depends on $r_0$ and $r$ only through $\rho$, whereas a separate value of
$d_4$ is required for each $r_0$ and $r$ pair. In particular, assume $r_0 = r$. Then
$\rho = 1/2$. This is the case for which Table II of Chen is provided. Whereas
Table II requires a separate entry for each value of $r$ ($\rho$ in Chen's notation),
only the Krishnaiah and Armitage (1966) table for $\rho = 1/2$ is needed when
procedure $R_2$ is used. The Krishnaiah and Armitage table also provides
percentage points for many more values of $\nu$ than does Table II.

4.  UNKNOWN  VARIANCE,  UNKNOWN  CORRELATION  CASE

In  this  section  the  case  is  considered  in  which  the  treatments  and 
control  have  a  common  unknown  variance  and  a  common  unknown  correlation. 

Let $Y_1, \ldots, Y_n$ be defined as in Section 3. Assume $r = r_0$, that is,
the correlation between a treatment and the control is equal to the
correlation between two treatments. But now assume $r$ is unknown. This
model might be used in a repeated measures design in which each of the
$k + 1$ observations in $Y_j$ is an observation on the same individual or
experimental unit. Let $X$ be the sample mean of $Y_1, \ldots, Y_n$. Each of the variables
$X_0 - X_i$, $i = 1, \ldots, k$, has the variance $2\sigma^2(1 - r)/n$. Let

$$S_1^2 = \sum_{j=1}^{n} \big(Y_{0j} - Y_{1j} - (X_0 - X_1)\big)^2/(n - 1).$$

$S_1^2$ will be used as an estimate of $2\sigma^2(1 - r)$.

4.1  Selection  Procedure 

Procedure $R_3$: Include population $\Pi_i$ in the selected subset if and only if

$$x_i \ge x_0 - c_3 s_1/\sqrt{n} \qquad (4.1)$$

where $c_3$ is chosen to satisfy (4.2).

Theorem 3: For a given $P^*$, if $c_3$ is chosen to satisfy

$$F_{k,n-1}(c_3, \ldots, c_3; 1/2) = P^*, \qquad (4.2)$$

then $R_3$ satisfies the P*-condition.

Proof: Let $U_1, \ldots, U_n$ be defined by $U_{ij} = Y_{0j} - Y_{ij}$, $i = 1, \ldots, k$,
$j = 1, \ldots, n$. Let $W$ be the sample mean of $U_1, \ldots, U_n$. Then $W_i = X_0 - X_i$.
$S_1^2$ is the upper left corner element of the sample covariance matrix computed
from $U_1, \ldots, U_n$. Thus $S_1^2$ is independent of $W$. The elements of $W$ are equally
correlated with correlation equal to 1/2. Using these facts, the proof is
now similar to the proof of Theorem 2. ||
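Procedure $R_3$ needs only the sample means and the sample variance of the first set of differences. A minimal Python sketch, with a hypothetical function name, made-up data and a placeholder value for $c_3$ (not a tabled constant), might look as follows.

```python
import math

def select_r3(data, c3):
    """Apply rule (4.1).

    data is a list of n observation vectors [Y_0j, Y_1j, ..., Y_kj];
    returns (selected treatment indices, S_1^2).
    """
    n = len(data)
    k = len(data[0]) - 1
    xbar = [sum(row[i] for row in data) / n for i in range(k + 1)]
    # Differences U_1j = Y_0j - Y_1j and their unbiased sample variance S_1^2.
    u1 = [row[0] - row[1] for row in data]
    w1 = xbar[0] - xbar[1]
    s1_sq = sum((u - w1) ** 2 for u in u1) / (n - 1)
    cutoff = xbar[0] - c3 * math.sqrt(s1_sq / n)
    selected = [i for i in range(1, k + 1) if xbar[i] >= cutoff]
    return selected, s1_sq

# Made-up repeated-measures data: 4 subjects, control plus 3 treatments.
data = [[10, 9, 12, 5], [12, 10, 13, 6], [11, 11, 11, 4], [11, 10, 12, 5]]
print(select_r3(data, c3=3.0))
```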


The constant $c_3$ can be obtained from the table of Krishnaiah and
Armitage (1966) as explained in Section 3.2. Only the $\rho = 1/2$ table is
needed to obtain $c_3$. The table of Gupta and Sobel (1957) can also be
used to obtain $c_3$. The correspondence of notation is $p = k$, $P^* = P^*$,
$\nu = \nu$ (here $\nu = n - 1$) and $q/\sqrt{2} = c_3$, where the notation on the left of
each equality is the Gupta and Sobel notation and the notation on the right
is the notation of this paper. This table covers $P^* = .75, .90$ and $.975$,
values not covered by the Krishnaiah and Armitage table.

The use of $S_1^2$ as an estimate of $2\sigma^2(1 - r)$ is not entirely satisfactory.
Any of the statistics

$$S_i^2 = \sum_{j=1}^{n} \big(Y_{0j} - Y_{ij} - (X_0 - X_i)\big)^2/(n - 1), \quad i = 1, \ldots, k,$$

could be used; $S_1^2$ was chosen arbitrarily. It would be good to combine
the $S_i^2$'s to get a better estimate. But the $S_i^2$'s are not independent, so
their sum may not have a chi-squared distribution. If $n$ is large then
$S^2 = \sum_{i=1}^{k} S_i^2/k$ may be used in place of $S_1^2$ in procedure $R_3$ and $c_3$ may be
approximated by the value in Table 1 of Gupta, Nagel and Panchapakesan (1973).
This is valid since $S^2$ converges to $2\sigma^2(1 - r)$ in probability as $n \to \infty$.
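The individual difference variances and their average can be computed in a few lines. The following Python sketch uses hypothetical function names and made-up data, and simply evaluates each $S_i^2$ and the pooled $S^2$ described above; it does not address the distributional question raised in the text.

```python
def difference_variances(data):
    """Return [S_1^2, ..., S_k^2] from rows [Y_0j, Y_1j, ..., Y_kj]."""
    n = len(data)
    k = len(data[0]) - 1
    result = []
    for i in range(1, k + 1):
        u = [row[0] - row[i] for row in data]  # control minus treatment i
        mean_u = sum(u) / n
        result.append(sum((v - mean_u) ** 2 for v in u) / (n - 1))
    return result

def pooled_variance(data):
    """S^2: the average of the k difference variances."""
    s_sq = difference_variances(data)
    return sum(s_sq) / len(s_sq)

# Made-up data: 4 subjects, control plus 3 treatments.
data = [[10, 9, 12, 5], [12, 10, 13, 6], [11, 11, 11, 4], [11, 10, 12, 9]]
print(difference_variances(data), pooled_variance(data))
```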

5.  FURTHER  COMMENTS

Each of the procedures $R_1$, $R_2$ and $R_3$ has this form: include population
$\Pi_i$ in the selected subset if and only if

$$x_i \ge x_0 - c\,SE(X_0 - X_i)$$

where $c$ is an appropriate constant and $SE(X_0 - X_i)$ is the standard deviation
of $X_0 - X_i$ or an estimate thereof. This form reduces the number of
parameters upon which the constant $c$ depends. For example, in Section 2 the
constant $c$ does not depend on the parameter $\gamma$ whereas, if the rule is written
in the form of Chen (1980), the constant does depend on $\gamma$. Berger and
Gupta (1980) found that the use of the standard deviation of the differences,
$X_0 - X_i$, had other advantages in a different subset selection problem. This
consideration of the differences as the important variables and use of
their standard deviations may be advantageous in other similar problems.